13 research outputs found

    Generative Adversarial Networks for Bitcoin Data Augmentation

    In Bitcoin entity classification, results are strongly conditioned by the ground-truth dataset, especially when applying supervised machine learning approaches. However, these ground-truth datasets are frequently affected by significant class imbalance, because they generally contain much more information on legal services (Exchange, Gambling) than on services that may be related to illicit activities (Mixer, Service). Class imbalance increases the complexity of applying machine learning techniques and reduces the quality of classification results, especially for underrepresented but critical classes. In this paper, we propose to address this problem by using Generative Adversarial Networks (GANs) for Bitcoin data augmentation, as GANs have recently shown promising results in the domain of image classification. However, there is no "one-fits-all" GAN solution that works for every scenario. In fact, setting GAN training parameters is non-trivial and heavily affects the quality of the generated synthetic data. We therefore evaluate how GAN parameters such as the optimization function, the size of the dataset and the chosen batch size affect GAN implementation for one underrepresented entity class (Mining Pool) and demonstrate how a "good" GAN configuration can be obtained that achieves high similarity between synthetically generated and real Bitcoin address data. To the best of our knowledge, this is the first study presenting GANs as a valid tool for generating synthetic address data for data augmentation in Bitcoin entity classification. Comment: 8 pages, 5 figures, 4 tables

    Cascading Machine Learning to Attack Bitcoin Anonymity

    Bitcoin is a decentralized, pseudonymous cryptocurrency that is one of the most used digital assets to date. Its unregulated nature and the inherent anonymity of its users have led to a dramatic increase in its use for illicit activities. This calls for the development of novel methods capable of characterizing different entities in the Bitcoin network. In this paper, a method to attack Bitcoin anonymity is presented, leveraging a novel cascading machine learning approach that requires only a few features directly extracted from Bitcoin blockchain data. Cascading, used to enrich entity information with data from previous classifications, led to considerably improved multi-class classification performance, with Precision values close to 1.0 for each considered class. Final models were implemented and compared using different machine learning models and showed significantly higher accuracy compared to their baseline implementation. Our approach can contribute to the development of effective tools for Bitcoin entity characterization, which may assist in uncovering illegal activities. Comment: 15 pages, 7 figures, 4 tables, presented in 2019 IEEE International Conference on Blockchain (Blockchain
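The cascading idea in this abstract, where a later classifier sees the original features plus the output of an earlier classification, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the nearest-centroid model, the helper names (`fit_centroids`, `class_distances`, `enrich`) and the two-feature toy data are all assumptions made for the example. A realistic setup would also derive the first-stage outputs from held-out folds to avoid leakage.

```python
from collections import defaultdict
from math import dist


def fit_centroids(samples):
    """Toy classifier (assumption): store one mean feature vector per class."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def class_distances(centroids, features):
    """First-stage output: distance to every class centroid, in sorted class order."""
    return tuple(dist(centroids[label], features) for label in sorted(centroids))


def enrich(centroids, samples):
    """Append the first-stage output to each sample's original features."""
    return [
        (features + class_distances(centroids, features), label)
        for features, label in samples
    ]


def predict(centroids, features):
    """Return the class whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training data: (features, label) per entity.
train = [
    ((1.0, 2.0), "Exchange"), ((1.2, 1.8), "Exchange"),
    ((4.0, 5.0), "Gambling"), ((4.2, 5.2), "Gambling"),
]

stage1 = fit_centroids(train)            # first classification
enriched_train = enrich(stage1, train)   # features + first-stage output
stage2 = fit_centroids(enriched_train)   # second classifier on enriched data

# Classify a new entity: enrich it with stage 1, then ask stage 2.
new_entity = (1.1, 2.1)
enriched_entity = new_entity + class_distances(stage1, new_entity)
prediction = predict(stage2, enriched_entity)
print(prediction)
```

The second stage works on four values per entity instead of two, which is the sense in which the earlier classification "enriches" the entity information.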

    Visual Analytics Platform for Centralized COVID-19 Digital Contact Tracing

    The COVID-19 pandemic and its dramatic worldwide impact have required global multidisciplinary actions to mitigate its effects. Mobile phone activity-based digital contact tracing (DCT) via Bluetooth low energy technology has been considered a powerful pandemic monitoring tool, yet it sparked a controversial debate about privacy risks for people. In order to explore the potential benefits of a DCT system in the context of occupational risk prevention, this article presents the potential of visual analytics methods to summarize and extract relevant information from complex DCT data collected during a long-term experiment at our research center. Visual tools were combined with quantitative metrics to provide insights into contact patterns among volunteers. Results showed that crucial actors, such as participants acting as bridges between groups, could be easily identified, ultimately allowing for more informed management decisions aimed at containing the potential spread of a disease. This research work has been carried out within the context of the RAPID initiative, fostered by the Basque Government as part of the fast reaction program (PRAP Euskadi, led by SPRI, the entity of the Economic Development, Sustainability, and Environment Department of the Basque Government for promoting the Basque industry) with the aim to boost the Basque industrial sector by maintaining the productive activity in the context of the threat of the COVID-19 pandemic. Three research centers of BRTA (Basque Research and Technology Alliance) have collaborated in this R&D initiative: Tecnalia, Ikerlan, and Vicomtech. Among the different research lines carried out in the RAPID initiative, Vicomtech has been responsible for the centralized BLE-based DCT system and visual analytics of the obtained data which has been selected as one of the representative cases by the OECD of pandemic reaction report

    Temporal graph-based approach for behavioural entity classification

    Graph-based analyses have gained a lot of relevance in the past years due to their high potential in describing complex systems by detailing the actors involved, their relations and their behaviours. Nevertheless, in scenarios where these aspects are evolving over time, it is not easy to extract valuable information or to characterize correctly all the actors. In this study, a two-phase approach for exploiting the potential of graph structures in the cybersecurity domain is presented. The main idea is to convert a network classification problem into a graph-based behavioural one. We extract graph structures that can represent the evolution of both normal and attack entities and apply a temporal dissection approach in order to highlight their micro-dynamics. Further, three clustering techniques are applied to the normal entities in order to aggregate similar behaviours, mitigate the imbalance problem and reduce noisy data. Our approach suggests the implementation of two promising deep learning paradigms for entity classification based on Graph Convolutional Networks

    Bitcoin and cybersecurity: temporal dissection of blockchain data to unveil changes in entity behavioral patterns

    The Bitcoin network is not only vulnerable to cyber-attacks but also currently represents the most frequently used cryptocurrency for concealing illicit activities. Typically, Bitcoin activity is monitored by decreasing the anonymity of its entities using machine learning-based techniques, which consider the whole blockchain. This entails two issues: first, it increases the complexity of the analysis, requiring higher efforts and, second, it may hide network micro-dynamics important for detecting short-term changes in entity behavioral patterns. The aim of this paper is to address both issues by performing a 'temporal dissection' of the Bitcoin blockchain, i.e., dividing it into smaller temporal batches to achieve entity classification. The idea is that a machine learning model trained on a certain time-interval (batch) should achieve good classification performance when tested on another batch if entity behavioral patterns are similar. We apply cascading machine learning principles (a type of ensemble learning applying stacking techniques), introducing a 'k-fold cross-testing' concept across batches of varying size. Results show that the blockchain batch size used for entity classification could be reduced for certain classes (Exchange, Gambling, and eWallet), as classification rates did not vary significantly with batch size, suggesting that behavioral patterns did not change significantly over time. Mixer and Market class detection, however, can be negatively affected. A deeper analysis of Mining Pool behavior showed that models trained on recent data perform better than models trained on older data, suggesting that 'typical' Mining Pool behavior may be represented better by recent data. This work provides a first step towards uncovering entity behavioral changes via temporal dissection of blockchain data. This work was partially funded by the European Commission through the Horizon 2020 research and innovation program, as part of the 'TITANIUM' project (Grant Agreement No. 740558)
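The train-on-one-batch, test-on-another idea described in this abstract can be illustrated with a small sketch. It is not the paper's implementation: the nearest-centroid model, the function names (`fit_centroids`, `cross_test`), the yearly batch labels and the feature values are all invented for the example. It only shows how a score for every ordered pair of temporal batches could be collected.

```python
from collections import defaultdict
from math import dist


def fit_centroids(samples):
    """Toy classifier (assumption): store one mean feature vector per class."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the class whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted class matches the true label."""
    hits = sum(predict(centroids, features) == label for features, label in samples)
    return hits / len(samples)


def cross_test(batches):
    """Train on each temporal batch and test on every other batch."""
    scores = {}
    for train_name, train_samples in batches.items():
        model = fit_centroids(train_samples)
        for test_name, test_samples in batches.items():
            if test_name != train_name:
                scores[(train_name, test_name)] = accuracy(model, test_samples)
    return scores


# Hypothetical batches: (features, label) per entity, grouped by time interval.
# In the made-up "2017" batch the Mining Pool features drift towards Exchange.
batches = {
    "2015": [((1.0, 1.0), "Exchange"), ((1.2, 0.8), "Exchange"),
             ((5.0, 5.0), "Mining Pool"), ((5.2, 4.8), "Mining Pool")],
    "2016": [((1.1, 0.9), "Exchange"), ((0.9, 1.1), "Exchange"),
             ((5.1, 5.1), "Mining Pool"), ((4.9, 4.9), "Mining Pool")],
    "2017": [((1.0, 1.2), "Exchange"), ((1.3, 1.0), "Exchange"),
             ((1.5, 1.5), "Mining Pool"), ((1.4, 1.6), "Mining Pool")],
}

scores = cross_test(batches)
for (train_name, test_name), score in sorted(scores.items()):
    print(f"train {train_name} -> test {test_name}: accuracy {score:.2f}")
```

A stable score across batch pairs suggests that behavior is similar over time, while a drop for a particular pair (here, models trained on the earlier toy batches and tested on "2017") points to a change in behavioral patterns between those intervals.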